1 LRT Normal Example

Let’s see an example of the normal distribution likelihood ratio test.

Let’s sample \(n=20\) from a normal distribution with mean \(\mu=-2\) and standard deviation \(\sigma=2\). This is the true underlying distribution.

We don’t know the true mean and we are trying to estimate it. Let’s (somewhat artificially) assume we know the standard deviation \(\sigma=2\); in real life we would have to estimate this, too.
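A minimal sketch of this setup in R. The seed and the variable names are assumptions made for illustration, so the draw will not reproduce the numbers reported on this page.

```r
# Draw n = 20 observations from a normal distribution with mean -2 and sd 2.
# The seed is arbitrary (an assumption); a different seed gives a different sample.
set.seed(1)
nn <- 20
true_mean <- -2
true_sd <- 2

# rnorm() is parameterized by the standard deviation, not the variance
x <- rnorm(nn, mean = true_mean, sd = true_sd)
summary(x)
```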

1.1 Histogram of data:

We’ve drawn a sample so we can plot the sample distribution. The MLE is the sample mean:

\(\bar{X}\) = -1.78
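To see why the sample mean is the MLE, one option is to evaluate the likelihood on a grid of candidate means and look at where it peaks. The sketch below is an assumption about how a table like `lik_data` could be built: the column names `mu` and `likelihood` are inferred from how the plotting code on this page uses them, and the seed, grid range, and step size are arbitrary.

```r
# Simulate a sample (arbitrary seed, so values differ from those reported on this page)
set.seed(1)
nn <- 20
true_sd <- 2
x <- rnorm(nn, mean = -2, sd = true_sd)

# Candidate values of mu; range and step are assumptions for illustration
mu_grid <- seq(-6, 2, by = 0.01)

# Likelihood of the whole sample at each candidate mu, with sigma treated as known
lik <- sapply(mu_grid, function(m) prod(dnorm(x, mean = m, sd = true_sd)))
lik_data <- data.frame(mu = mu_grid, likelihood = lik)

# The grid maximizer should agree with the sample mean up to the grid resolution
mu_hat_grid <- lik_data$mu[which.max(lik_data$likelihood)]
c(grid_mle = mu_hat_grid, sample_mean = mean(x))
```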

1.3 Where is the maximum likelihood estimator?

Under the null? Under the alternative? Under the whole space? For the likelihood ratio test we need the MLE restricted to the null parameter space and also the MLE over the entire parameter space.
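For the one-sided hypotheses used in the code on this page, \(H_0: \mu \geq \mu_0\) vs \(H_1: \mu < \mu_0\) with \(\sigma\) known, both maximizers have a closed form. As a sketch (the symbols \(\lambda\), \(\mu_0\) and \(\hat\mu_0\) are notation introduced here for illustration): the likelihood as a function of \(\mu\) is proportional to \(\exp[-n(\bar{x}-\mu)^2/(2\sigma^2)]\), so it peaks at \(\bar{x}\) and decreases on either side. The unrestricted MLE is therefore \(\bar{x}\), and the MLE under the null is the point of the null region closest to \(\bar{x}\):

\[\hat\mu = \bar{x}, \qquad \hat\mu_0 = \max(\bar{x}, \mu_0)\]

\[\lambda(x) = \frac{\sup_{\mu \geq \mu_0} L(\mu \mid x)}{\sup_{\mu} L(\mu \mid x)} = \begin{cases} 1 & \text{if } \bar{x} \geq \mu_0 \\ \exp\left[-\dfrac{n(\bar{x}-\mu_0)^2}{2\sigma^2}\right] & \text{if } \bar{x} < \mu_0 \end{cases}\]

Small values of \(\lambda(x)\) are evidence against \(H_0\).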

1.4 MLE and LRT as \(H_0\) threshold varies

Let’s vary the threshold in the null hypothesis. How do the MLE and the LRT change?

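library(dplyr)     # filter(), slice_max(), pull(), %>%
library(ggplot2)
library(glue)
library(patchwork) # combines the panels below with + and plot_layout()

# Plot the likelihood curve for H0: mu >= mu_cut (red) vs H1: mu < mu_cut (blue)
# and report the likelihood ratio statistic in the title.
# lik_data needs columns `mu` (grid of candidate means) and `likelihood`.
plot_lrt <- function(mu_cut, lik_data, nn=20, true_mean=-2, true_sd=2) {
  
  # Maximizers over the null region, the alternative region, and the whole grid;
  # with_ties=FALSE guarantees exactly one row each
  mle_null  <- lik_data %>% filter(mu >= mu_cut) %>% slice_max(likelihood, n=1, with_ties=FALSE)
  mle_alternative <- lik_data %>% filter(mu < mu_cut) %>% slice_max(likelihood, n=1, with_ties=FALSE)
  mle_global <- lik_data %>% slice_max(likelihood, n=1, with_ties=FALSE)
  
  # LRT statistic: likelihood maximized under H0 divided by the global maximum
  lrt_stat <- (mle_null %>% pull(likelihood))/(mle_global %>% pull(likelihood))
  
  # Shading uses the same boundary as the filters above: mu_cut belongs to H0
  p <- ggplot(lik_data, aes(x=mu, y=likelihood))+
    geom_line() + 
    geom_area(aes(y=ifelse(mu >= mu_cut, 1.1*max(likelihood), 0),x=mu),fill="red",alpha=0.2)+
    geom_area(aes(y=ifelse(mu < mu_cut, 1.1*max(likelihood), 0),x=mu),fill="blue",alpha=0.2)+
    geom_area(aes(y=ifelse(mu >= mu_cut, likelihood, 0),x=mu),fill="red",alpha=0.7)+
    geom_area(aes(y=ifelse(mu < mu_cut, likelihood, 0),x=mu),fill="blue",alpha=0.7)+
    ggtitle(glue("Likelihood function of sample of X\n n={nn}, true mu={true_mean}, LRT = {signif(lrt_stat,3)}"))+
    annotate("text",x=-4, y=max(lik_data$likelihood),label=glue("H1: mu < {mu_cut}\n muhat_1 = {mle_alternative%>%pull(mu)} \n L={signif(mle_alternative%>%pull(likelihood),3)}"))+
    annotate("text",x=0, y=max(lik_data$likelihood),label=glue("H0: mu >= {mu_cut}\n muhat_0 = {mle_null%>%pull(mu)} \n L={signif(mle_null%>%pull(likelihood),3)}"))+
    # Orange line: MLE under H0; green line: MLE under H1
    geom_vline(xintercept=mle_null %>% pull(mu), color="orange")+
    geom_vline(xintercept=mle_alternative %>% pull(mu),color="green")
  return(p)
}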


p1 <- plot_lrt(0, lik_data = lik_data)
p2 <- plot_lrt(-1, lik_data = lik_data)
p3 <- plot_lrt(-2, lik_data = lik_data)
p4 <- plot_lrt(-3, lik_data = lik_data)

#(p1 + p2) / (p3 + p4)
p1 + p2 + p3 + p4 + plot_layout(ncol=2)
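
As a rough cross-check on the four panels, the LRT statistic can also be computed without a grid. This sketch assumes the known-variance closed form for \(H_0: \mu \geq \mu_0\): the statistic is 1 when \(\bar{x} \geq \mu_0\) and \(\exp[-n(\bar{x}-\mu_0)^2/(2\sigma^2)]\) otherwise. The helper name `lrt_normal` is hypothetical, and \(\bar{x} = -1.78\) is the rounded value reported above, so results can differ slightly from the grid-based plots.

```r
# Closed-form LRT statistic for H0: mu >= mu0 vs H1: mu < mu0, sigma known.
# `lrt_normal` is a hypothetical helper introduced for illustration.
lrt_normal <- function(xbar, mu0, n, sigma) {
  ifelse(xbar >= mu0, 1, exp(-n * (xbar - mu0)^2 / (2 * sigma^2)))
}

xbar <- -1.78            # rounded sample mean reported on this page
mu_cuts <- c(0, -1, -2, -3)
lrt <- lrt_normal(xbar, mu_cuts, n = 20, sigma = 2)
data.frame(mu_cut = mu_cuts, lrt = signif(lrt, 3))
```

For thresholds at or below the sample mean (here -2 and -3) the sample mean itself lies in the null region, so the statistic is exactly 1. It shrinks as the threshold moves above the sample mean.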

1.5 Different samples from same underlying population, same H0 and H1

Let’s keep the null hypothesis the same:

\(H_0: \mu \geq -1.5\) vs \(H_1: \mu < -1.5\)

but draw randomly from the same distribution a few times. Our true mean (-2) is pretty close to the null hypothesis threshold (-1.5). How does our conclusion change with each sample?
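One way to explore this is to repeat the draw several times and record the sample mean, the LRT statistic, and a decision. The sketch below rests on several assumptions: the seed and the number of repetitions are arbitrary, the statistic uses the known-variance closed form, and the decision uses a size 0.05 test that rejects \(H_0\) when \((\bar{x}-\mu_0)/(\sigma/\sqrt{n})\) falls below the 5% normal quantile. That rule is equivalent to rejecting for small values of the LRT statistic.

```r
set.seed(2024)   # arbitrary seed (an assumption)
nn <- 20
true_mean <- -2
true_sd <- 2
mu0 <- -1.5      # H0: mu >= -1.5 vs H1: mu < -1.5

# One simulated sample: sample mean, LRT statistic, and test decision
one_draw <- function() {
  x <- rnorm(nn, mean = true_mean, sd = true_sd)
  xbar <- mean(x)
  lambda <- if (xbar >= mu0) 1 else exp(-nn * (xbar - mu0)^2 / (2 * true_sd^2))
  z <- (xbar - mu0) / (true_sd / sqrt(nn))
  data.frame(xbar = xbar, lrt = lambda, reject_H0 = z < qnorm(0.05))
}

# Repeat for six independent samples and stack the results
results <- do.call(rbind, replicate(6, one_draw(), simplify = FALSE))
results
```

Because the true mean sits only about one standard error below the threshold (\(\sigma/\sqrt{n} \approx 0.45\)), some samples give a sample mean above -1.5 and a statistic of 1, while others give a small statistic.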

2 Example 8.2.3, Exponential LRT

This is a shifted exponential distribution:

\[f(x) = \exp{[-(x-\theta)]} I(x \geq \theta)\]

We can simulate from this distribution by simulating a variable \(X+\theta\) where \(X\) has the standard Exp(1) distribution.

Let’s assume \(\theta = 4\).

Let’s sample \(n=20\) from this distribution. The MLE of \(\theta\) is the minimum of the sample data, \(X_{(1)}\). In this sample, \(X_{(1)}\) = 4.14.
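A minimal sketch of the simulation in R, using the shift construction described above. The seed is an arbitrary assumption, so the minimum will not match the 4.14 reported here.

```r
set.seed(1)      # arbitrary seed (an assumption)
theta <- 4
nn <- 20

# Shifted exponential: a standard Exp(1) draw plus the shift theta
x <- rexp(nn, rate = 1) + theta

# The MLE of theta is the sample minimum; it can never fall below the true theta
theta_hat <- min(x)
theta_hat
```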

2.2 Calculate LRT

The likelihood ratio test statistic is the ratio of the likelihood maximized over the null parameter space to the likelihood maximized over the full parameter space.
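For the shifted exponential this ratio has a closed form. As a sketch, treating the boundary point \(x = \theta\) as part of the support and writing \(\lambda\) and \(\theta_0\) for the statistic and the null threshold (notation introduced here), the likelihood for \(H_0: \theta \leq \theta_0\) vs \(H_1: \theta > \theta_0\) is

\[L(\theta \mid x) = \exp\left[-\sum_{i=1}^{n} x_i + n\theta\right] I(\theta \leq x_{(1)})\]

which increases in \(\theta\) until it drops to zero past \(x_{(1)}\). The unrestricted maximizer is therefore \(x_{(1)}\), and the maximizer under the null is \(\min(x_{(1)}, \theta_0)\). Taking the ratio, the \(\sum x_i\) terms cancel:

\[\lambda(x) = \begin{cases} 1 & \text{if } x_{(1)} \leq \theta_0 \\ \exp\left[-n(x_{(1)} - \theta_0)\right] & \text{if } x_{(1)} > \theta_0 \end{cases}\]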

2.3 Hypothesis where MLE \(X_{(1)} > \theta_0\).

Let’s test the hypotheses:

\(H_0: \theta \leq 3\) vs \(H_1: \theta > 3\)

Note that we have set \(\theta_0 = 3\) and we have the MLE \(X_{(1)} > \theta_0\).

2.4 Hypothesis where MLE \(X_{(1)} > \theta_0 >\) true \(\theta\).

Let’s get close to the true \(\theta\) by testing the hypotheses:

\(H_0: \theta \leq 4.1\) vs \(H_1: \theta > 4.1\)

Note that we have set \(\theta_0 = 4.1\) and we still have the MLE \(X_{(1)} > \theta_0\).

2.5 Hypothesis where MLE \(X_{(1)} \leq \theta_0\).

Now let’s test the hypotheses:

\(H_0: \theta \leq 5\) vs \(H_1: \theta > 5\)

Note that we have set \(\theta_0 = 5\) and we have the MLE \(X_{(1)} \leq \theta_0\).
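The three cases can be compared numerically. This sketch assumes the closed form for \(H_0: \theta \leq \theta_0\), namely a statistic of 1 when \(X_{(1)} \leq \theta_0\) and \(\exp[-n(X_{(1)} - \theta_0)]\) otherwise. The helper name `lrt_shifted_exp` is hypothetical, and 4.14 is the rounded sample minimum reported above.

```r
# LRT statistic for H0: theta <= theta0 vs H1: theta > theta0, shifted exponential.
# `lrt_shifted_exp` is a hypothetical helper introduced for illustration.
lrt_shifted_exp <- function(x_min, theta0, n) {
  ifelse(x_min <= theta0, 1, exp(-n * (x_min - theta0)))
}

x_min <- 4.14            # rounded sample minimum reported on this page
nn <- 20
theta0 <- c(3, 4.1, 5)   # the three null thresholds considered above

lrt <- lrt_shifted_exp(x_min, theta0, nn)
data.frame(theta0 = theta0, lrt = signif(lrt, 3))
```

With \(\theta_0 = 3\) the statistic is \(\exp(-22.8)\), essentially zero, so the data strongly contradict \(H_0\). With \(\theta_0 = 4.1\) it is \(\exp(-0.8) \approx 0.45\). With \(\theta_0 = 5\) the sample minimum lies inside the null region and the statistic is exactly 1.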